Structure Discovery in Sequentially-connected Data Streams
Authors
Abstract
Much current data mining research focuses on discovering sets of attributes that discriminate data entities into classes, such as shopping trends for a particular demographic group. In contrast, we are developing data mining techniques that discover patterns consisting of complex relationships between entities. Our research is particularly applicable to domains in which the data is event-driven, such as counter-terrorism intelligence analysis. In this paper we describe an algorithm designed to operate over relational data received from a continuous stream. Our approach includes a mechanism for summarizing discoveries from previous data increments so that the globally best patterns can be computed by examining only the new data increment. We then describe a method for efficiently resolving relational dependencies that span temporal increment boundaries, so that additional pattern instances that do not reside entirely in a single data increment can be discovered. We also describe a method for change detection using a measure of central tendency designed for graph data. We contrast two formulations of the change detection process and demonstrate the ability to identify salient changes along meaningful dimensions and to recognize trends in a relational data stream.
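The summarization idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a pattern is a single labeled edge `(source_label, relation, target_label)` and uses raw frequency as a stand-in for whatever pattern-quality measure the paper uses. The names `count_edge_patterns` and `IncrementalPatternSummary` are hypothetical. The point it shows is that only per-pattern totals are retained, so the globally best pattern is recomputed after each increment without rescanning earlier data.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Assumption: a "pattern" is one labeled edge (source, relation, target).
EdgePattern = Tuple[str, str, str]


def count_edge_patterns(edges: Iterable[EdgePattern]) -> Counter:
    """Count the instances of each edge pattern within one data increment."""
    return Counter(edges)


class IncrementalPatternSummary:
    """Keeps running per-pattern totals so old increments can be discarded."""

    def __init__(self) -> None:
        self.totals: Counter = Counter()

    def absorb(self, increment_counts: Counter) -> None:
        """Fold the counts from a new increment into the running summary."""
        self.totals.update(increment_counts)

    def best(self, k: int = 1) -> List[Tuple[EdgePattern, int]]:
        """Return the k globally most frequent patterns seen so far."""
        return self.totals.most_common(k)


summary = IncrementalPatternSummary()

increment_1 = [
    ("person", "calls", "person"),
    ("person", "owns", "car"),
    ("person", "calls", "person"),
]
summary.absorb(count_edge_patterns(increment_1))
print(summary.best())  # [(('person', 'calls', 'person'), 2)]

increment_2 = [
    ("person", "owns", "car"),
    ("person", "owns", "car"),
    ("person", "wires", "account"),
]
summary.absorb(count_edge_patterns(increment_2))
print(summary.best())  # [(('person', 'owns', 'car'), 3)]
```

The locally best pattern in the second increment only becomes the global best once its count is combined with the summary of the first. This sketch ignores pattern instances that straddle an increment boundary, which the abstract treats as a separate step.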
Similar resources
A Dynamic Method for Answering Ad Hoc Continuous Aggregate Queries
Data streams are unbounded, fast sequences of time-stamped data elements that arrive in bursts. Generally, these elements need to be processed online and in real time, so algorithms that process data streams and answer queries over them are mostly one-pass. Executing such algorithms raises challenges such as memory limitations, scheduling, and answer accuracy. They will be more ...
On-line Hybrid System of Computational Intelligence for Data Streams Adaptive Processing
Computational intelligence methods are now widely used for data mining tasks under uncertainty, nonlinearity, and disturbance by various types of stochastic and chaotic noise. The paper proposes a hybrid neuro-neo-fuzzy system of computational intelligence. This system is distinguished by its computational simplicity, the high speed of its learning process an...
Need For Speed : Mining Sequential Patterns in Data Streams
Recently, the data mining community has focused on a challenging new model in which data arrives sequentially in the form of continuous, rapid streams, often referred to as data streams or streaming data. Data from many real-world applications are more appropriately handled by the data stream model than by traditional static databases. Such applications include stock tickers, network traffic measu...
SPAMS: A Novel Incremental Approach for Sequential Pattern Mining in Data Streams
Mining sequential patterns in data streams is a challenging new problem for the data mining community, since data arrives sequentially in the form of continuous, rapid, and infinite streams. In this paper, we propose a new on-line algorithm, SPAMS, to deal with the sequential pattern mining problem in data streams. This algorithm uses an automaton-based structure to maintain the set of frequent se...
Frequent Subgraph Mining from Streams of Linked Graph Structured Data
Nowadays, high volumes of high-value data (e.g., semantic web data) can be generated and published at high velocity. A collection of these data can be viewed as a big, interlinked, dynamic graph structure of linked resources. Embedded in them is implicit, previously unknown, and potentially useful knowledge. Hence, efficient knowledge discovery algorithms for mining frequent subgraphs from t...
Journal:
- International Journal on Artificial Intelligence Tools
Volume 15, Issue
Pages -
Publication date: 2006